Improved Example-Based Speech Enhancement by Using Deep Neural Network Acoustic Model for Noise Robust Example Search

نویسندگان

  • Atsunori Ogawa
  • Keisuke Kinoshita
  • Marc Delcroix
  • Tomohiro Nakatani
چکیده

Example-based speech enhancement is a promising singlechannel approach for coping with highly nonstationary noise. Given a noisy speech input, it first searches in a noisy speech corpus for the noisy speech examples that best match the input. Then, it concatenates the clean speech examples that are paired with the matched noisy examples to obtain an estimate of the underlying clean speech component in the input. The quality of the enhanced speech depends on how accurate an example search can be performed given a noisy speech input. The example search is conventionally performed using a Gaussian mixture model (GMM) with mel-frequency cepstral coefficient features (MFCCs). To improve the noise robustness of the GMM-based example search, instead of using noise sensitive MFCCs, we have proposed using bottleneck features (BNFs), which are extracted from a deep neural network-based acoustic model (DNN-AM) built for automatic speech recognition. In this paper, instead of using a GMM with noise robust BNFs, we propose the direct use of a DNN-AM in the example search to further improve its noise robustness. Experimental results on the Aurora4 corpus show that the DNN-AM-based example search steadily improves the enhanced speech quality compared with the GMM-based example search using BNFs.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Robust Example Search Using Bottleneck Features for Example-Based Speech Enhancement

Example-based speech enhancement is a promising approach for coping with highly non-stationary noise. Given a noisy speech input, it first searches in noisy speech corpora for the noisy speech examples that best match the input. Then, it concatenates the clean speech examples that are paired with the matched noisy examples to obtain an estimate of the underlying clean speech component in the in...

متن کامل

شبکه عصبی پیچشی با پنجره‌های قابل تطبیق برای بازشناسی گفتار

Although, speech recognition systems are widely used and their accuracies are continuously increased, there is a considerable performance gap between their accuracies and human recognition ability. This is partially due to high speaker variations in speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...

متن کامل

Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods

Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterance into transcription. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...

متن کامل

Factored Deep Convolutional Neural Networks for Noise Robust Speech Recognition

In this paper, we present a framework of a factored deep convolutional neural network (CNN) learning for noise robust automatic speech recognition (ASR). Deep CNN architecture, which has attracted great attention in various research areas, has also been successfully applied to ASR. However, to ensure noise robustness, since merely introducing deep CNN architecture into the acoustic modeling of ...

متن کامل

Linear Prediction-based Dereverberation with Advanced Speech Enhancement and Recognition Technologies for the Reverb Challenge

This paper describes systems for the enhancement and recognition of distant speech recorded in reverberant rooms. Our speech enhancement (SE) system handles reverberation with blind deconvolution using linear filtering estimated by exploiting the temporal correlation of observed reverberant speech signals. Additional noise reduction is then performed using an MVDR beamformer and advanced model-...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2017